Europäisches Patentamt
European Patent Office
Office européen des brevets



Publication number: 0 335 739 A2



EUROPEAN PATENT APPLICATION 



Application number: 89303217.7
Date of filing: 31.03.89

Int. Cl.4: G 10 L 7/00, G 06 K 9/64

Priority: 31.03.88 JP 78827/88

Date of publication of application: 04.10.89 Bulletin 89/40

Designated Contracting States: DE FR GB NL

Applicant: KABUSHIKI KAISHA TOSHIBA
72, Horikawa-cho, Saiwai-ku
Kawasaki-shi, Kanagawa-ken 210 (JP)

Inventor: Segawa, Hideo, c/o Kabushiki Kaisha Toshiba
Patent Div., 1-1 Shibaura 1-chome, Minato-ku
Tokyo 105 (JP)

Representative: Freed, Arthur Woolf et al
MARKS & CLERK, 57-60 Lincoln's Inn Fields
London WC2A 3LS (GB)






Pattern recognition system.

A pattern recognition system includes a feature extracting
section (4) for extracting a feature of an input pattern, a memory
section (8) for storing a reference pattern for each category, a
similarity calculation section (7) for calculating a similarity
between the feature obtained by the feature extracting section
and the reference pattern stored in the memory section (8), and
a posterior probability transformation section (9) for transforming
the similarity calculated by the similarity calculation
section (7) into a posterior probability. The posterior probability
transformation section (9) calculates the posterior probability
on the basis of the similarity calculated by the similarity
calculation section (7) and a category thereof, by using a
parameter set that is required for calculating the posterior
probability and is obtained in advance in recognition processing
of each category.
Description

Pattern recognition system 

The present invention relates to a pattern recognition system capable of improving recognition accuracy by
combining posterior probabilities obtained from similarity values (or differences between reference patterns
and input patterns) of input acoustic units or input characters in pattern recognition such as speech
recognition or character string recognition and, more particularly, to a pattern recognition system in which an
a priori probability based on contents of a lexicon is reflected in a posterior probability.

Known conventional pattern recognition systems recognize continuously input utterances or characters in
units of word or character sequences. As one example of such pattern recognition systems, a connected digit speech
recognition algorithm using a method called the multiple similarity (MS) method will be described below.

Continuously uttered input utterances in a system are divided into frames of a predetermined duration. For
example, an input utterance interval $[1, m]$ having 1st to $m$-th frames as shown in Fig. 1 will be described. In
preprocessing of speech recognition, a spectral change is extracted each time one frame of an utterance is
input, and word boundary candidates are obtained in accordance with the magnitude of the spectral changes.
That is, a large spectral change can be considered a condition of word boundaries. In this case, the term
"word" means a unit of an utterance to be recognized. The speech recognition system referred to is composed of
a hierarchy of lower to higher recognition levels, e.g., a phoneme level, a syllable level, a word level and a
sentence level. The "words" as units of utterances to be recognized correspond to a phoneme, a syllable, a
word, and a sentence at the corresponding levels. Word recognition processing is executed whenever a
word boundary candidate is obtained.

In the word sequence recognition processing, the interval $[1, m]$ is divided into two partial intervals, i.e.,
intervals $[1, k_i]$ and $[k_i, m]$, where $k_i$ indicates the frame number of the $i$-th word boundary candidate.
The interval $[1, k_i]$ is an utterance interval corresponding to a word sequence $W_{k_i}$, and the interval
$[k_i, m]$ is a word utterance interval corresponding to a single word $w_i$. A word sequence $W_i$ is represented by:

$$W_i = W_{k_i} + w_i \tag{1}$$

and corresponds to a recognition word sequence candidate of the utterance interval $[1, m]$ divided at the $i$-th
word boundary candidate. The recognition word sequence candidates $W_i$ are obtained for all the word boundary
candidates $k_i$ ($i = 1, 2, \ldots, \ell$). Of these candidates thus obtained, a word sequence $W$ having a maximum
similarity value (value representing a similarity of this pattern with respect to a reference pattern) is adopted
as a recognition word sequence of the utterance interval $[1, m]$. Note that $\ell$ represents the number of
recognition word sequence candidates corresponding to partial intervals to be stored upon word sequence
recognition and is a parameter set in the system. By sequentially increasing $m$ in this algorithm, recognition
word sequences corresponding to all the utterance intervals can be obtained.
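
The search over word boundary candidates described above can be sketched as a small dynamic program. This is only an illustration under assumptions that are not in the source: the function name `best_word_sequence`, the `word_score` callback, and the additive combination of scores are hypothetical, and a single best candidate is kept per boundary rather than several.

```python
from typing import Callable, Dict, List, Tuple

def best_word_sequence(
    boundaries: List[int],
    end_frame: int,
    word_score: Callable[[int, int], Tuple[str, float]],
) -> Tuple[List[str], float]:
    """Return the best-scoring word sequence for the interval [1, end_frame].

    boundaries: word boundary candidate frames k_i (hypothetical input).
    word_score(start, end): best single word and its score for [start, end].
    """
    # best[k] holds the best (word sequence, total score) for the interval [1, k].
    best: Dict[int, Tuple[List[str], float]] = {1: ([], 0.0)}
    frames = sorted({k for k in boundaries if 1 < k < end_frame} | {end_frame})
    for m in frames:
        candidates = []
        # Every key already in best is a boundary k < m, so [k, m] is a
        # candidate interval for the last word of the sequence ending at m.
        for k, (sequence, total) in best.items():
            word, score = word_score(k, m)
            candidates.append((sequence + [word], total + score))
        best[m] = max(candidates, key=lambda c: c[1])
    return best[end_frame]
```

With a toy score table in which the split at frame 5 scores better than treating frames 1 to 9 as one word, the function returns the two-word sequence.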

In the above continuous speech recognition method, the number of input words is unknown. Therefore, in
order to correctly recognize an input utterance pattern $L$ as a word sequence $W$, whether each detected
interval correctly corresponds to an uttered word must be considered. Even if this is considered, it is difficult
to obtain a high recognition rate in the word sequence recognition as long as the similarity values are merely
combined. This is because the similarity is not a probabilistic measure.

Therefore, some conventional systems transform an obtained similarity value into a posterior probability and
use this posterior probability as a measure for achieving higher accuracy than that of the similarity.
Assume that speech recognition is to be performed for an input word sequence

$$W = w_1 w_2 \ldots w_n, \quad w_i \in C$$

including $n$ words belonging to a word set $C = \{c_1, c_2, \ldots, c_N\}$ so as to satisfy the following two conditions:

(1) A word boundary is correctly recognized.
(2) The word category of each utterance interval is correctly recognized.

In this case, as shown in Fig. 2, assume that each word $w_i$ corresponds to a pattern $\ell_i$ in each partial
utterance interval to satisfy the following relation:

$$L = \ell_1 \ell_2 \ldots \ell_n$$

In this case, if the word sequence $W$ has no grammatical structure, $w_i$ and $w_j$ can be considered
independent events ($i \neq j$). Hence the probability that each utterance interval is correctly recognized to be a
corresponding word is represented by the following equation:

$$P(W \mid L) = \prod_{i=1}^{n} P(w_i \mid \ell_i) \tag{2}$$

In this equation, $P(W \mid L)$ is called likelihood. Upon calculation of $P(W \mid L)$, in order to avoid repeated
multiplication, logarithms of both sides of equation (2) are often taken to obtain the logarithmic likelihood as
follows:

$$\log P(W \mid L) = \sum_{i=1}^{n} \log P(w_i \mid \ell_i) \tag{3}$$

In this equation, $P(w_i \mid \ell_i)$ is a conditional probability that an interval $\ell_i$ corresponds to $w_i$ and is a posterior
probability to be obtained.
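
As a small numerical illustration of equations (2) and (3), the following sketch sums per-word log posteriors and checks that the result matches the product form. The function name and the sample values are hypothetical.

```python
import math
from typing import Sequence

def sequence_log_likelihood(posteriors: Sequence[float]) -> float:
    """log P(W | L) as the sum of log P(w_i | l_i), following equation (3).

    Each posterior must be strictly positive, since log(0) is undefined.
    """
    return sum(math.log(p) for p in posteriors)

# Hypothetical per-word posteriors for a three-word sequence.
posteriors = [0.9, 0.8, 0.95]
log_likelihood = sequence_log_likelihood(posteriors)
# Exponentiating the sum recovers the product of equation (2).
likelihood = math.exp(log_likelihood)
```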

Therefore, by transforming an obtained similarity value into a posterior probability by a table, a high 
recognition rate can be obtained. 

Since it is practically difficult to obtain the posterior probability $P(w_i \mid \ell_i)$, however, a similarity value is
normally used instead of a probability value, while properly biasing the similarity value to make it approximate
a probability value. For example, Ukita et al. performed approximation by an exponential function as shown
in Fig. 3 ("A Speaker Independent Recognition Algorithm for Connected Word Boundary Hypothesizer", Proc.
ICASSP, Tokyo, April 1986):

$$P = \begin{cases} A \cdot B^{S} & (S < S_{\max}) \\ 1 & (\text{otherwise}) \end{cases} \tag{4}$$

A logarithm of equation (4) is calculated and the relation $A \cdot B^{S_{\max}} = 1.0$ is utilized to obtain the following
equation:

$$\log P = \begin{cases} S - S_{\max} & (S < S_{\max}) \\ 0 & (\text{otherwise}) \end{cases} \tag{5}$$

By subtracting a fixed bias $S_{\max}$ from similarity $S$, a similarity value is transformed into a probability value.
When this measure is used in connected digit speech recognition, the bias $S_{\max}$ is set to 0.96.
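
The fixed-bias measure of equation (5) amounts to a one-line function. The sketch below uses the bias value 0.96 quoted above as the default. The function name is hypothetical.

```python
def fixed_bias_log_probability(similarity: float, s_max: float = 0.96) -> float:
    """Equation (5): log P = S - Smax for S < Smax, and 0 otherwise."""
    return similarity - s_max if similarity < s_max else 0.0
```

Because the same curve is applied whatever the lexicon, every category and every vocabulary size is scored identically. The description that follows identifies this as the limitation of the conventional method.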

A posterior probability curve, however, is not generally a fixed curve but a variable one depending on the size
of a lexicon or the contents of the lexicon (e.g., whether the number of similar words is large). Therefore, the
conventional method of transforming a similarity value into a posterior probability on the basis of only one fixed
curve, applied unchanged to many applications, cannot perform recognition with high accuracy.

As described above, in the conventional pattern recognition system for estimating similarity by transforming
the similarity into a posterior probability, a transformation curve for obtaining the posterior probability is
approximated to a fixed curve because it is difficult to obtain a curve corresponding to the contents of a
lexicon or the number of words. Therefore, recognition cannot be performed with high accuracy.

It is an object of the present invention to provide a pattern recognition system capable of performing 
recognition with high accuracy by performing similarity-posterior transformation on the basis of a parameter 
easily obtained by learning the training data belonging to the lexicon. 

The pattern recognition system according to the present invention performs posterior probability
transformation processing for transforming a similarity value calculated from the feature vectors of an input
pattern and a reference pattern for each category into a posterior probability calculated on the basis of the
recognized category, the calculated similarity and a transformation parameter acquired from learning in
advance.

The transformation parameter is a parameter set including parameters for defining a distribution of
similarities of correctly recognized input patterns in recognition processing acquired from the similarity value
training data of each category, parameters for defining a distribution of similarities of erroneously recognized
input patterns in the recognition processing, and a weighting coefficient $\omega$ required for calculating the
posterior probability from the two distributions. In transformation calculation, the posterior
probability is calculated on the basis of the calculated similarity value and the above transformation parameter
set.

That is, in the pattern recognition process, a predetermined calculation is performed by using transformation
parameters corresponding to the recognition result, thereby transforming a similarity value into a desired
posterior probability. In addition, the transformation requires complicated calculations. Therefore, by setting
the calculation results into a table in advance, the processing speed can be increased.

Therefore, according to the pattern recognition system of the present invention, a correct posterior
probability transformation parameter can be obtained from a small number of samples, and the accuracy of
recognition processing can be greatly improved by using the parameters.

This invention can be more fully understood from the following detailed description when taken in 
conjunction with the accompanying drawings, in which: 

Fig. 1 is a view for explaining an input utterance interval and a word boundary candidate;
Fig. 2 is a view showing a correspondence between an utterance pattern and a word sequence;
Fig. 3 is a graph showing an approximate transformation function used in a conventional system;
Fig. 4 is a block diagram of a continuous speech digit recognition system according to an embodiment of
the present invention;
Fig. 5 is a block diagram showing an arrangement of a similarity-posterior probability transformation
section of the system shown in Fig. 4;
Fig. 6 is a flow chart showing parameter learning steps of the similarity-posterior probability
transformation section of the system shown in Fig. 4; and
Fig. 7 is a graph showing posterior probability curves obtained in the parameter learning steps of the
similarity-posterior probability transformation section of the system shown in Fig. 4.

A principle of a pattern recognition system according to an embodiment of the present invention will be 
described below. 

In the system of the present invention, posterior probability transformation processing for transforming a
similarity value calculated from a feature vector of an input pattern and a reference pattern for each category
into a posterior probability is typically performed by a transformation parameter memory section and a
transformation calculation section to be described below or by a table having functions of the two sections.

The transformation parameter memory section stores, in units of categories, a parameter set including
parameters $(\alpha, \beta)$ for defining a distribution of similarity values correctly recognized in recognition processing
derived from the similarity value training data, parameters $(\bar{\alpha}, \bar{\beta})$ for defining a distribution of similarities
erroneously recognized in the recognition processing, and a weighting coefficient $\omega$ required for calculating a
posterior probability from the two distributions.

The transformation calculation section calculates a posterior probability on the basis of a similarity value and
the parameter set stored in the above transformation parameter memory section.
Assuming that a partial utterance pattern $\ell_i$ is classified into a word recognition result as its category and a
similarity value (especially, a multiple similarity), a posterior probability $P(w_i \mid \ell_i)$ is rewritten as follows:

$$P(w_i \mid \ell_i) \to P(w_i \mid T_i \wedge s_i) \tag{6}$$

(where $T_i$ is an event in which a recognized category of $\ell_i$ in a multiple similarity method is $w_i$, and $s_i$ is a
multiple similarity value of $\ell_i$ concerning a word $w_i$)
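Relation (6) can be transformed as follows by using Bayes' theorem:

$$
\begin{aligned}
P(w_i \mid T_i \wedge s_i)
&= \frac{P(s_i \mid T_i \wedge w_i)\, P(T_i \wedge w_i)}{P(T_i \wedge s_i)} \\
&= \frac{P(s_i \mid T_i \wedge w_i)\, P(T_i \wedge w_i)}
        {P(s_i \mid T_i \wedge w_i)\, P(T_i \wedge w_i) + P(s_i \mid T_i \wedge \bar{w}_i)\, P(T_i \wedge \bar{w}_i)} \\
&= \frac{P(s_i \mid T_i \wedge w_i)}
        {P(s_i \mid T_i \wedge w_i) + P(s_i \mid T_i \wedge \bar{w}_i)\, \dfrac{P(T_i \wedge \bar{w}_i)}{P(T_i \wedge w_i)}}
\end{aligned}
\tag{7}
$$

where $\bar{w}_i$ is an event in which a pattern $\ell_i$ does not belong to the category $w_i$.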
Statistics in the equation (7) will be described below. 
$P(s_i \mid T_i \wedge w_i)$ will be described first.

$P(s_i \mid T_i \wedge w_i)$ is the probability of obtaining the similarity value $s_i$ when the recognized category obtained
in the multiple similarity method is $w_i$ and the category of input data is $w_i$. This curve can be approximated by
the following equation:

$$P(s_i \mid T_i \wedge w_i) = \frac{(1 - s_i)^{\alpha_i - 1}}{\beta_i^{\alpha_i}\, \Gamma(\alpha_i)} \exp\left\{-\frac{1 - s_i}{\beta_i}\right\} \tag{8}$$

where $\alpha_i$ and $\beta_i$ are parameters obtained from training data: $\alpha_i$ represents the number of components not
involved in the reference pattern in the multiple similarity method, and $\beta_i$ their mean variance. This parameter
estimation method is described in "Distribution of Similarity Values in Multiple Similarity Method" by Hideo
Segawa et al. (Shingaku Giho PRU87-18, June 1987), and an effective amount of training data for parameter
estimation is only several tens of samples.
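
The text cites Segawa et al. for the estimation method without reproducing it. As a stand-in, and purely as an assumption, the sketch below fits the two parameters of equation (8) by the method of moments. It treats $1 - s$ as gamma distributed with shape $\alpha$ and scale $\beta$, so that the mean is $\alpha\beta$ and the variance is $\alpha\beta^2$. The function name is hypothetical.

```python
from typing import Sequence, Tuple

def estimate_alpha_beta(similarities: Sequence[float]) -> Tuple[float, float]:
    """Method-of-moments fit of (alpha, beta) for the density in equation (8).

    This estimator is an assumption of the sketch, not the method cited in
    the source. It needs at least two distinct similarity values.
    """
    xs = [1.0 - s for s in similarities]  # equation (8) is written in 1 - s
    mean = sum(xs) / len(xs)
    variance = sum((x - mean) ** 2 for x in xs) / len(xs)
    if variance == 0.0:
        raise ValueError("similarity samples must not all be identical")
    alpha = mean * mean / variance  # from mean = alpha * beta
    beta = variance / mean          # and variance = alpha * beta ** 2
    return alpha, beta
```

Only two sample moments are needed, which is in keeping with the remark above that several tens of samples suffice.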
$P(s_i \mid T_i \wedge \bar{w}_i)$ will be described below.

$P(s_i \mid T_i \wedge \bar{w}_i)$ is the probability of obtaining the similarity value $s_i$ when the recognized category in the
multiple similarity method is $w_i$ while the category of input data is not $w_i$. In continuous speech recognition,
$\bar{w}_i$ is especially problematic. Therefore, not only combinations of categories which easily cause $\bar{w}_i$ to be
erroneously recognized as $w_i$, but also word contexts which are patterns not corresponding to a particular
category involved in the lexicon and easily causing erroneous recognition, such as:

(1) Part of a certain word
(Ex) "6 [roku]" → "6-9 [roku-kyuu]"
(2) Transient part between words
(Ex) "3-1 [san-ichi]" → "3-2-1 [san-ni-ichi]"
(3) Combination of two word patterns
(Ex) "2-2 [ni-ni]" → "2 [ni]"

must be examined, and their similarity distributions must be estimated. (Within the brackets are
phonetic symbols indicating how the numerals are pronounced in the Japanese language.) The similarity
distribution can be approximated by equation (8). Parameters in this similarity distribution are written
$(\bar{\alpha}_i, \bar{\beta}_i)$ so as to be distinguished from the parameters $(\alpha_i, \beta_i)$ in equation (8). The parameters
$(\bar{\alpha}_i, \bar{\beta}_i)$ can be easily calculated similarly to the parameters $(\alpha_i, \beta_i)$.

$P(T_i \wedge \bar{w}_i)/P(T_i \wedge w_i)$ will be taken into consideration. This statistic corresponds to an a priori probability in
Bayesian probability and to an occurrence frequency ratio of a category. $P(T_i \wedge w_i)$ represents the probability
that a recognition result obtained by a subspace method is $w_i$ and an input pattern is $w_i$.
This statistic is calculated in a learning procedure as follows:

$$\frac{P(T_i \wedge \bar{w}_i)}{P(T_i \wedge w_i)}
= \frac{\text{(frequency that events } T_i \text{ and } \bar{w}_i \text{ occur)}}{\text{(frequency that events } T_i \text{ and } w_i \text{ occur)}}
= \omega$$

The obtained $\omega$ is a weighting coefficient.
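
The frequency ratio above can be counted directly from labelled recognition results. The sketch below takes $\omega$ as the frequency of incorrect recognitions divided by the frequency of correct recognitions for each recognized category. This is the orientation under which $\omega$ weights the incorrect-pattern term in equation (11). The function name and the record layout are hypothetical.

```python
from typing import Dict, Hashable, Iterable, Optional, Tuple

def weighting_coefficients(
    results: Iterable[Tuple[Hashable, Optional[Hashable]]],
) -> Dict[Hashable, float]:
    """Count-based omega per recognized category.

    results: (recognized_category, true_category) pairs, where the true
    category may be None for a pattern that is not a lexicon word at all.
    """
    correct: Dict[Hashable, int] = {}
    incorrect: Dict[Hashable, int] = {}
    for recognized, truth in results:
        bucket = correct if recognized == truth else incorrect
        bucket[recognized] = bucket.get(recognized, 0) + 1
    # Categories that were never recognized correctly are skipped, because
    # the ratio is undefined for them.
    return {c: incorrect.get(c, 0) / n for c, n in correct.items()}
```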
As described above, each parameter set is a statistic which can be easily calculated by learning.
In pattern recognition, a set of necessary parameters $\alpha$, $\beta$, $\bar{\alpha}$, $\bar{\beta}$ and $\omega$ is read out from the transformation
parameter memory section in accordance with the obtained similarity $s_i$ to perform a calculation based on
equation (7) in the transformation calculation section, thereby transforming the similarity value into a desired
posterior probability. The transformation calculation section must perform complicated calculations.
Therefore, by setting the results of transformation calculation into a table, the processing speed can be further
increased.

As a result, a posterior probability transforming means can be constituted by a small data amount with high 
accuracy, thereby improving recognition accuracy. 

A word sequence recognition system as the pattern recognition system according to the embodiment of the
present invention based on the above principle will be described below.

Fig. 4 shows an arrangement of the word sequence recognition system for connected digit speech 
recognition. 

Referring to Fig. 4, an utterance input section 1 transforms a continuous utterance into a predetermined
electrical signal and supplies the signal to a preprocessor 2. The preprocessor 2 comprises an acoustic
process section 3, a spectral change extraction section 4, an utterance start/end point determination section
5, and a word boundary candidate generation section 6. The acoustic process section 3 performs spectral
analysis for the input utterance data in units of frames by using a filter bank of, e.g., 8 to 30 channels, thereby
extracting a feature pattern. The spectral change extraction section 4 extracts a difference $\Delta U$ between
spectrum data $D_m$ of each frame. The utterance start/end point determination section 5 detects the start and
end points of the utterance on the basis of the magnitude of the extracted spectral change. When the spectral
change $\Delta U$ is larger than a predetermined threshold value $\theta$, the word boundary candidate generation
section 6 outputs the corresponding frame as a word boundary candidate $k_i$.
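
The thresholding step performed by the word boundary candidate generation section can be sketched as follows. The source does not say how the spectral change is measured, so the sum of absolute channel differences between consecutive frames is an assumption here, as are the function name and the input layout (one list of filter-bank channel values per frame).

```python
from typing import List, Sequence

def word_boundary_candidates(
    frames: Sequence[Sequence[float]], threshold: float
) -> List[int]:
    """Return 1-based frame numbers whose spectral change exceeds the threshold."""
    candidates = []
    for m in range(1, len(frames)):
        # Spectral change between frame m + 1 and frame m (1-based), taken
        # here as the sum of absolute channel differences (an assumption).
        change = sum(abs(a - b) for a, b in zip(frames[m], frames[m - 1]))
        if change > threshold:
            candidates.append(m + 1)
    return candidates
```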

The feature patterns corresponding to $n$ word interval candidates $[k_i, m]$ obtained from the boundary
candidates $k_i$ ($i = 1$ to $n$) are supplied to a word recognition section 7 and subjected to word recognition using
a word dictionary 8 therein. A word recognition candidate of each word interval candidate is transformed into a
posterior probability by a similarity-posterior probability transformation section 9 and supplied to a word
sequence recognition section 10. The word sequence recognition section 10 combines a word sequence
candidate for each word sequence interval $[1, k_i]$ ($i = 1$ to $k$) registered in a recognition word sequence
candidate registration section 11 with the similarity transformed into the posterior probability to perform word
sequence recognition. Word sequence recognition candidates thus obtained are stored in the recognition
word sequence candidate registration section 11. When the utterance start/end point determination section 5
detects the end point of the utterance, the registered word sequence candidate having the highest
similarity is output as the recognized word sequence.

Fig. 5 shows an arrangement of the similarity-posterior probability transformation section 9. The section 9
comprises a transformation calculation section 21 and a transformation parameter memory section 22. The
transformation parameter memory section 22 is a table which stores parameters such as:
$\alpha$, $\beta$: similarity distribution of correct patterns
$\bar{\alpha}$, $\bar{\beta}$: similarity distribution of incorrect patterns
$\omega$: a priori probability ratio between correct and incorrect patterns
These parameter sets can be calculated by learning.
Fig. 6 shows an algorithm of this learning.

That is, this learning processing includes first and second learning steps 31 and 32. In the first learning step
31, uttered word sequence data is divided or classified in accordance with word boundary data and a word
category given as instructive data to form a reference pattern (template) of a word utterance based on the
multiple similarity method. In the second learning step 32, a word sequence is uttered again in accordance with
the word boundary data and the word category given as the instructive data to generate a word utterance
interval candidate, and a word similarity calculation with respect to the reference pattern (template) formed in
the above first learning step is performed on the basis of the generated word interval candidate data, thereby
obtaining word similarity data and a word recognition result. On the basis of the result and the given instructive
data, correct and incorrect data similarity distributions and a category appearance frequency are calculated to
obtain a posterior probability curve concerning a similarity value.
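
The bookkeeping in the second learning step, which separates the similarity values of correct and incorrect recognitions for each recognized category, might look like the sketch below. The record layout and the function name are hypothetical. The two returned groups are the raw material from which the correct and incorrect similarity distributions are then fitted.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Optional, Tuple

Record = Tuple[Hashable, Optional[Hashable], float]

def collect_similarity_samples(
    records: Iterable[Record],
) -> Tuple[Dict[Hashable, List[float]], Dict[Hashable, List[float]]]:
    """Split similarity values by recognized category and by correctness.

    records: (recognized_category, true_category, similarity) triples, with
    the true category taken from the instructive (labelled) data.
    """
    correct: Dict[Hashable, List[float]] = defaultdict(list)
    incorrect: Dict[Hashable, List[float]] = defaultdict(list)
    for recognized, truth, similarity in records:
        target = correct if recognized == truth else incorrect
        target[recognized].append(similarity)
    return dict(correct), dict(incorrect)
```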

The posterior probability curve obtained as a result of the above learning is shown in Fig. 7. 

When the learning is performed for all categories, parameters $(\alpha_i, \beta_i, \bar{\alpha}_i, \bar{\beta}_i, \omega)$ for all the categories can be
obtained. These parameters are stored in the transformation parameter memory section 22.

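The transformation calculation section 21 transforms a similarity $s_i$ by the following equations:

$$P(s_i \mid T_i \wedge w_i) = \frac{(1 - s_i)^{\alpha_i - 1}}{\beta_i^{\alpha_i}\, \Gamma(\alpha_i)} \exp\left\{-\frac{1 - s_i}{\beta_i}\right\} \tag{9}$$

$$P(s_i \mid T_i \wedge \bar{w}_i) = \frac{(1 - s_i)^{\bar{\alpha}_i - 1}}{\bar{\beta}_i^{\bar{\alpha}_i}\, \Gamma(\bar{\alpha}_i)} \exp\left\{-\frac{1 - s_i}{\bar{\beta}_i}\right\} \tag{10}$$

and then calculates the posterior probability by the following transformation equation:

$$P(w_i \mid T_i \wedge s_i) = \frac{P(s_i \mid T_i \wedge w_i)}{P(s_i \mid T_i \wedge w_i) + P(s_i \mid T_i \wedge \bar{w}_i) \cdot \omega} \tag{11}$$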
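
A minimal sketch of the transformation calculation follows. It assumes the two similarity distributions take the gamma form in $1 - s$ with parameters $(\alpha, \beta)$ and $(\bar{\alpha}, \bar{\beta})$, and that $\omega$ weights the incorrect-pattern term. The function names and the parameter values in the usage lines are hypothetical.

```python
import math

def similarity_density(s: float, alpha: float, beta: float) -> float:
    """Gamma-form density of x = 1 - s with shape alpha and scale beta."""
    x = 1.0 - s
    if x <= 0.0:
        return 0.0  # this sketch assigns no density at or above s = 1
    # Work in the log domain so that large alpha does not overflow.
    log_density = ((alpha - 1.0) * math.log(x) - x / beta
                   - alpha * math.log(beta) - math.lgamma(alpha))
    return math.exp(log_density)

def posterior(s: float, alpha: float, beta: float,
              alpha_bar: float, beta_bar: float, omega: float) -> float:
    """Posterior P(w | T and s) from the two densities and the weight omega."""
    correct = similarity_density(s, alpha, beta)
    incorrect = similarity_density(s, alpha_bar, beta_bar)
    denominator = correct + omega * incorrect
    return correct / denominator if denominator > 0.0 else 0.0

# Hypothetical parameters: correct patterns cluster near s = 1,
# incorrect patterns are spread over lower similarities.
high = posterior(0.97, 2.0, 0.02, 4.0, 0.05, 0.5)
low = posterior(0.80, 2.0, 0.02, 4.0, 0.05, 0.5)
```

With these made-up parameters the posterior is close to 1 at a similarity of 0.97 and close to 0 at 0.80. A curve of this kind depends on the category's own statistics, unlike the fixed curve of the conventional method.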

As described above, according to the system of the present invention, the similarity-posterior probability
transformation section can be easily formed by the simple learning processing, and the recognition
processing can be performed with high accuracy by using the obtained transformation section.

Upon transformation into a posterior probability, different transformation curves are preferably used for the
respective recognition categories. When a common transformation curve is to be used regardless of the
recognized category, however, the following equation may be used:

$$P = \sum_{i=1}^{N} P(w_i \mid T_i \wedge s_i) / N \tag{12}$$
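
Equation (12) is an arithmetic mean over the $N$ categories. For one similarity value it can be sketched as below, where the function name and the sample values are hypothetical.

```python
from typing import Iterable

def common_posterior(per_category_posteriors: Iterable[float]) -> float:
    """Equation (12): mean of the N per-category posteriors at one similarity."""
    values = list(per_category_posteriors)
    return sum(values) / len(values)

# Hypothetical posteriors of three categories at the same similarity value.
averaged = common_posterior([0.9, 0.7, 0.8])
```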

In addition, since the transformation calculation section must perform complicated calculations, the
transformation calculation section and the transformation parameter memory section may be combined into a
table. As a result, the transformation speed can be increased.
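
One way to realise such a table is to tabulate the transformation on a uniform grid of similarity values and look up the nearest entry at recognition time. The grid size, the nearest-entry rule, the assumption that similarities lie in [0, 1], and the function names are all choices of this sketch rather than details given in the source.

```python
from typing import Callable, List

def build_table(transform: Callable[[float], float], steps: int = 100) -> List[float]:
    """Tabulate transform(s) at s = 0, 1/steps, ..., 1."""
    return [transform(i / steps) for i in range(steps + 1)]

def lookup(table: List[float], s: float) -> float:
    """Nearest-entry lookup; s is clamped to the range [0, 1]."""
    steps = len(table) - 1
    index = round(min(max(s, 0.0), 1.0) * steps)
    return table[index]

# A stand-in transformation; in practice this would be the per-category
# similarity-to-posterior function.
table = build_table(lambda s: s * s, steps=100)
value = lookup(table, 0.5)
```

A finer grid trades memory for resolution, and one table per category preserves the category-dependent curves.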

The present invention can be applied not only to speech recognition but also to other kinds of pattern
recognition such as character recognition.
Claims

1. A pattern recognition system comprising:
feature extracting means (4) for extracting a feature of an input pattern;
memory means (8) for storing a reference pattern for each category;
similarity calculating means (7) for calculating a similarity between the feature vector extracted by said
feature extracting means (4) and the reference pattern stored in said memory means (8); and
posterior probability transforming means (9) for transforming the similarity calculated by said similarity
calculating means (7) into a posterior probability,
characterized in that said posterior probability transforming means includes means for calculating the
posterior probability by using a parameter set required for calculating the posterior probability on the
basis of the similarity calculated by said similarity calculating means and a category thereof, said
parameter set obtained in recognition processing from training similarity value data in advance.

2. A pattern recognition system comprising:
feature extracting means (4) for extracting a feature of an input pattern;
memory means (8) for storing a reference pattern of each category;
similarity calculating means (7) for calculating a similarity value between the feature vector extracted by
said feature extracting means (4) and the reference pattern stored in said memory means (8); and
posterior probability transforming means (9) for transforming the similarity value calculated by said
similarity calculating means (7) into a posterior probability,
characterized in that said posterior probability transforming means includes:
transformation parameter memory means (22) for storing, for each category, a parameter set including
parameters for defining a distribution of similarity values correctly recognized in recognition processing
derived from training similarity value in advance, parameters for defining a distribution of similarities
erroneously recognized in the recognition processing, and a weighting coefficient required for calculating
the posterior probability from the two parameter distributions; and
calculating means (21) for calculating the posterior probability on the basis of the similarity calculated by
said similarity calculating means (7) and the parameter set stored in said transformation parameter
memory means.

3. A system according to claim 2, characterized in that said transformation parameter memory means
(22) includes means for storing, for each category, a parameter set including parameters $\alpha$ and $\beta$ for
defining the distribution of similarities correctly recognized in the recognition processing derived from the
training similarity value data in advance, parameters $\bar{\alpha}$ and $\bar{\beta}$ for defining the distribution of similarities
erroneously recognized in the recognition processing, and a weighting coefficient $\omega$ required for
calculating the posterior probability from the two parameter distributions.

4. A system according to claim 3, characterized in that said calculating means (21) includes means for
transforming a similarity into the following equations on the basis of the similarity calculated by said
similarity calculating means (7) and the parameter set including $\alpha$, $\beta$, $\bar{\alpha}$, $\bar{\beta}$ and $\omega$ stored in said
transformation parameter memory means (22):

$$P(s_i \mid T_i \wedge w_i) = \frac{(1 - s_i)^{\alpha_i - 1}}{\beta_i^{\alpha_i}\, \Gamma(\alpha_i)} \exp\left\{-\frac{1 - s_i}{\beta_i}\right\}$$

$$P(s_i \mid T_i \wedge \bar{w}_i) = \frac{(1 - s_i)^{\bar{\alpha}_i - 1}}{\bar{\beta}_i^{\bar{\alpha}_i}\, \Gamma(\bar{\alpha}_i)} \exp\left\{-\frac{1 - s_i}{\bar{\beta}_i}\right\}$$

and calculating the posterior probability by the following transformation equation:

$$P(w_i \mid T_i \wedge s_i) = \frac{P(s_i \mid T_i \wedge w_i)}{P(s_i \mid T_i \wedge w_i) + P(s_i \mid T_i \wedge \bar{w}_i) \cdot \omega}.$$

5. A pattern recognition system comprising:
feature extracting means (4) for extracting a feature of an input pattern;
memory means (8) for storing a reference pattern for each category;
similarity calculating means (7) for calculating a similarity between the feature extracted by said feature
extracting means (4) and the reference pattern stored in said memory means (8); and
posterior probability transforming means (9) for transforming the similarity calculated by said similarity
calculating means (7) into a posterior probability,
characterized in that said posterior probability transforming means includes table memory means for
storing a transformation table for calculating the posterior probability calculated by using a parameter set
including a parameter for defining a distribution of similarity values correctly recognized in recognition
processing derived from the training similarity value data in advance with respect to the reference pattern
of each category on the basis of the similarity calculated by said similarity calculating means (7) and a
category thereof, a parameter for defining a distribution of similarities erroneously recognized in the
recognition processing, and a weighting coefficient required for calculating the posterior probability from
the two parameter distributions, and the similarity calculated by said similarity calculating means (7).

6. A system according to claim 5, characterized in that said table memory means (9) includes means for
storing, for each category, a posterior probability calculated by using the parameter set including
parameters $\alpha$ and $\beta$ for defining the distribution of similarity values correctly recognized in the recognition
processing derived from training similarity value data of each category, parameters $\bar{\alpha}$ and $\bar{\beta}$ for defining
the distribution of similarities erroneously recognized in the recognition processing, and a weighting
coefficient $\omega$ required for calculating the posterior probability from the two parameter distributions, and
the similarity calculated by said similarity calculating means (7).

7. A system according to claim 6, characterized in that said table memory means (9) includes means for
transforming a similarity into the following equations on the basis of the similarity calculated by said
similarity calculating means (7) and the parameter set including $\alpha$, $\beta$, $\bar{\alpha}$, $\bar{\beta}$ and $\omega$ stored in said
transformation parameter memory means:

$$P(s_i \mid T_i \wedge w_i) = \frac{(1 - s_i)^{\alpha_i - 1}}{\beta_i^{\alpha_i}\, \Gamma(\alpha_i)} \exp\left\{-\frac{1 - s_i}{\beta_i}\right\}$$

$$P(s_i \mid T_i \wedge \bar{w}_i) = \frac{(1 - s_i)^{\bar{\alpha}_i - 1}}{\bar{\beta}_i^{\bar{\alpha}_i}\, \Gamma(\bar{\alpha}_i)} \exp\left\{-\frac{1 - s_i}{\bar{\beta}_i}\right\}$$

and storing the posterior probability calculated by the following transformation equation:

$$P(w_i \mid T_i \wedge s_i) = \frac{P(s_i \mid T_i \wedge w_i)}{P(s_i \mid T_i \wedge w_i) + P(s_i \mid T_i \wedge \bar{w}_i) \cdot \omega}.$$